🎯 Reinforcement Learning · Scour

Reward Modeling for Reinforcement Learning-Based LLM Reasoning: Design, Challenges, and Evaluation

arxiv.org·1d

Risk-sensitive reinforcement learning using expectiles, shortfall risk and optimized certainty equivalent risk

arxiv.org·1d

Reinforcement Learning with R: Origins, Real-Life Applications, and Practical Implementation

dev.to·2d

A multi-agent reinforcement learning approach to autonomous aircraft taxiing with taxiing time, fuel consumption, and emission optimization

sciencedirect.com·22h

Show HN: Fighting the War Against Expensive Reinforcement Learning

cadenza-landing-qtu7gbjwb-akshparekh123-3457s-projects.vercel.app·4h

Recursive self-improvement from AI models

marginalrevolution.com·1d

A training principle for drifting models

breno.bearblog.dev·45m

ashworks1706/rlhf-from-scratch: A theoretical and practical deep dive into Reinforcement Learning with Human Feedback and its applications in Large Language Models from scratch.

github.com·2d

JupyterPS/VBAF: Visual Business Automation Framework - PowerShell-based reinforcement learning for education and business automation

github.com·1d

Observe emergent behavior in autonomous multi-agent LLM networks

agents.glide2.app·1d

Researchers propose a self-distillation fix for ‘catastrophic forgetting’ in LLMs

infoworld.com·1h

Backtracking Algorithms

algos.khourani.com·1d

Robotics Motion Learning: Training Linked Robot Arms with Kuramoto Models

hackernoon.com·20h

Show HN: A minimal online decision maker

decisionmaker.online·22h

Generalized Lanczos method for systematic optimization of neural-network quantum states

link.aps.org·1h

Architectural and Mathematical Foundations of Machine Learning: A Rigorous Synthesis of Theory, Geometry, and Implementation

chizkidd.github.io·22h

Behavioral economics-oriented energy storage investment analysis: A holistic decision support model with advanced fuzzy techniques

sciencedirect.com·19h

Magic Tricks, Moats, and the Three-Body Problem of AI Networks

caseyaccidental.com·20h

Steps to set up the game

meapps.itch.io·19h

Multi AI Agent Systems with crewAI

deeplearning.ai·38m
